AITopics | new test



Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

CLIFT: Analysing Natural Distribution Shift on Question Answering Models in Clinical Domain

Pal, Ankit

arXiv.org Artificial Intelligence, Oct-19-2023

This paper introduces a new testbed, CLIFT (Clinical Shift), for the clinical-domain question-answering task. The testbed includes 7.5k high-quality question-answering samples to provide a diverse and reliable benchmark. We performed a comprehensive experimental study and evaluated several deep-learning QA models under the proposed testbed. Despite impressive results on the original test set, performance degrades when the models are applied to new test sets, which indicates a distribution shift. Our findings emphasize the need for, and the potential of, increasing the robustness of clinical-domain models under distributional shifts. The testbed offers one way to track progress in that direction. It also highlights the necessity of adopting evaluation metrics that consider robustness to natural distribution shifts. We plan to expand the corpus by adding more samples and model results. The full paper and the updated benchmark are available at github.com/openlifescience-ai/clift.

dataset, distribution shift, natural distribution shift, (14 more...)

arXiv.org Artificial Intelligence

2310.13146

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Asia > Middle East > Israel (0.04)
Asia > India > Tamil Nadu > Chennai (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.82)


Use cases of Chi-squared test, part 1 (Machine Learning)

#artificialintelligence, Apr-1-2023, 19:55:17 GMT

Abstract: Taking the goodness-of-fit test (chi-squared test) as an example, this paper attempts to calculate the Bayes factor BF10 of an n-fold Bernoulli test using Excel (using JASP software as the evidence). The results showed that in the range of 0.15–0.55 …

Abstract: The sensitivity of gravitational wave searches is reduced by the presence of non-Gaussian noise in the detector data. These non-Gaussianities often match well with the template waveforms used in matched-filter searches, and require signal-consistency tests to distinguish them from astrophysical signals. However, empirically tuning these tests for maximum efficacy is time-consuming and limits the complexity of these tests. In this work we demonstrate a framework to use machine-learning techniques to automatically tune signal-consistency tests.
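For the goodness-of-fit setting in the first abstract above, the following is a minimal Python sketch, not the paper's Excel or JASP procedure. It computes the Pearson chi-squared statistic for an n-fold Bernoulli experiment and a Bayes factor BF10 against the point null. The function names, the null value `p0`, and the uniform Beta(1, 1) prior under the alternative are all assumptions made for illustration; the truncated abstract does not say which prior the paper uses.

```python
import math


def chi2_gof_bernoulli(successes, trials, p0=0.5):
    """Pearson chi-squared goodness-of-fit test of H0: P(success) = p0."""
    if trials <= 0 or not 0.0 < p0 < 1.0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0, 0 < p0 < 1, 0 <= successes <= trials")
    observed = (successes, trials - successes)
    expected = (trials * p0, trials * (1.0 - p0))
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Two categories give 1 degree of freedom, where the upper tail of the
    # chi-squared distribution has the closed form erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def bf10_bernoulli(successes, trials, p0=0.5):
    """Bayes factor for H1 (p ~ Beta(1, 1), an assumed prior) over H0 (p = p0)."""
    failures = trials - successes
    # Marginal likelihood under H1 is the Beta function B(k + 1, n - k + 1);
    # the binomial coefficient cancels between numerator and denominator.
    log_m1 = (math.lgamma(successes + 1) + math.lgamma(failures + 1)
              - math.lgamma(trials + 2))
    log_m0 = successes * math.log(p0) + failures * math.log(1.0 - p0)
    return math.exp(log_m1 - log_m0)


if __name__ == "__main__":
    print(chi2_gof_bernoulli(60, 100))
    print(bf10_bernoulli(60, 100))
```

With 60 successes in 100 trials the frequentist test sits near the conventional 5% threshold while the Bayes factor under this prior stays close to 1, which illustrates why the two summaries are worth comparing.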

chi-squared test part1, machine learning, signal-consistency test, (6 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.44)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.44)


New Tests of Randomness and Independence for Sequences of Observations

#artificialintelligence, Oct-25-2021, 22:45:35 GMT

There is no statistical test that assesses whether a sequence of observations, time series, or residuals in a regression model, exhibits independence or not. Typically, what data scientists do is to look at auto-correlations and see whether they are close enough to zero. If the data follows a Gaussian distribution, then absence of auto-correlations implies independence. Here however, we are dealing with non-Gaussian observations. The setting is similar to testing whether a pseudo-random number generator is random enough, or whether the digits of a number such as π behave in a way that looks random, even though the sequence of digits is deterministic.
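The routine check described above, looking at auto-correlations and seeing whether they are close to zero, can be sketched in a few lines of Python. This illustrates only that conventional check, not the new tests the article proposes. The function names and the rough ±1.96/√n white-noise band are assumptions chosen for illustration.

```python
import math


def autocorrelation(xs, lag):
    """Sample autocorrelation of the sequence xs at the given positive lag."""
    n = len(xs)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(xs)")
    mean = sum(xs) / n
    variance_sum = sum((x - mean) ** 2 for x in xs)
    if variance_sum == 0:
        raise ValueError("sequence is constant; autocorrelation is undefined")
    covariance_sum = sum((xs[i] - mean) * (xs[i + lag] - mean)
                         for i in range(n - lag))
    return covariance_sum / variance_sum


def flagged_lags(xs, max_lag):
    """Lags whose autocorrelation falls outside the approximate 95% band."""
    band = 1.96 / math.sqrt(len(xs))  # rough band for white noise (assumption)
    return [k for k in range(1, max_lag + 1)
            if abs(autocorrelation(xs, k)) > band]


if __name__ == "__main__":
    alternating = [1.0, -1.0] * 50
    print(autocorrelation(alternating, 1))
    print(flagged_lags(alternating, 5))
```

As the passage notes, an empty list of flagged lags does not establish independence when the observations are non-Gaussian; it only rules out linear dependence at the lags examined.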

independence, randomness, sequence, (14 more...)

#artificialintelligence

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence (0.49)
Information Technology > Data Science (0.35)


A Fast and Effective Large-Scale Two-Sample Test Based on Kernels

Song, Hoseung, Chen, Hao

arXiv.org Machine Learning, Oct-6-2021

Kernel two-sample tests have been widely used and the development of efficient methods for high-dimensional large-scale data is gaining more and more attention as we are entering the big data era. However, existing methods, such as the maximum mean discrepancy (MMD) and recently proposed kernel-based tests for large-scale data, are computationally intensive to implement and/or ineffective for some common alternatives for high-dimensional data. In this paper, we propose a new test that exhibits high power for a wide range of alternatives. Moreover, the new test is more robust to high dimensions than existing methods and does not require optimization procedures for the choice of kernel bandwidth and other parameters by data splitting. Numerical studies show that the new approach performs well in both synthetic and real world data.
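For context, here is a minimal Python sketch of the maximum mean discrepancy (MMD) baseline that the abstract mentions. It is not the authors' new test. It uses the biased (V-statistic) estimate of squared MMD with a Gaussian kernel and a permutation p-value. The function names, the fixed `bandwidth`, the number of permutations, and the seed are all illustrative assumptions. The quadratic cost of building the Gram matrix is one reason such tests become expensive on large-scale data.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel between two equal-length numeric tuples."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def mmd2_from_gram(gram, idx_x, idx_y):
    """Biased squared-MMD estimate from a precomputed pooled Gram matrix."""
    def mean_block(rows, cols):
        total = sum(gram[i][j] for i in rows for j in cols)
        return total / (len(rows) * len(cols))
    return (mean_block(idx_x, idx_x) + mean_block(idx_y, idx_y)
            - 2.0 * mean_block(idx_x, idx_y))


def mmd_permutation_test(xs, ys, bandwidth=1.0, n_perm=200, seed=0):
    """Return (observed squared MMD, permutation p-value) for two samples."""
    pooled = list(xs) + list(ys)
    n = len(xs)
    # O((n + m)^2) kernel evaluations, computed once and reused per permutation.
    gram = [[gaussian_kernel(a, b, bandwidth) for b in pooled] for a in pooled]
    indices = list(range(len(pooled)))
    observed = mmd2_from_gram(gram, indices[:n], indices[n:])
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(indices)  # relabel the pooled points at random
        if mmd2_from_gram(gram, indices[:n], indices[n:]) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + n_perm)


if __name__ == "__main__":
    sample_a = [(0.01 * i,) for i in range(20)]
    sample_b = [(5.0 + 0.01 * i,) for i in range(20)]
    print(mmd_permutation_test(sample_a, sample_b))
```

The fixed bandwidth here is a simplification; the abstract points out that existing methods often choose the bandwidth through optimization with data splitting, which this sketch does not attempt.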

new test, statistics, two-sample test, (14 more...)

arXiv.org Machine Learning

2110.03118

Country:

North America > United States > California > Yolo County > Davis (0.04)
North America > Canada (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)


FDA authorizes new test to detect past Covid-19 infections

#artificialintelligence, Mar-7-2021, 08:35:06 GMT

The Food and Drug Administration on Friday issued an emergency authorization for a new test to detect Covid-19 infections -- one that stands apart from the hundreds already authorized. Unlike tests that detect bits of SARS-CoV-2 or antibodies to it, the new test, called T-Detect COVID, looks for signals of past infections in the body's adaptive immune system -- in particular, the T cells that help the body remember what its viral enemies look like. Developed by Seattle-based Adaptive Biotechnologies, it is the first test of its kind. Adaptive's approach involves mapping antigens to their matching receptors on the surface of T cells. They and other researchers had already shown that the cast of T cells floating around in an individual's blood reflects the diseases they've encountered, in many cases years later.

adaptive, covid-19 infection, new test, (13 more...)

#artificialintelligence

Country: North America > United States (0.76)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Government > Regional Government > North America Government > United States Government > FDA (0.76)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.78)


Validating Label Consistency in NER Data Annotation

Zeng, Qingkai, Yu, Mengxia, Yu, Wenhao, Jiang, Tianwen, Weninger, Tim, Jiang, Meng

arXiv.org Artificial Intelligence, Jan-21-2021

Data annotation plays a crucial role in ensuring your named entity recognition (NER) projects are trained with the right information to learn from. Producing the most accurate labels is a challenge due to the complexity involved with annotation. Label inconsistency between multiple subsets of data annotation (e.g., training set and test set, or multiple training subsets) is an indicator of label mistakes. In this work, we present an empirical method to explore the relationship between label (in-)consistency and NER model performance. It can be used to validate the label consistency (or catch the inconsistency) in multiple sets of NER data annotation. In experiments, our method identified the label inconsistency of test data in the SCIERC and CoNLL03 datasets (with 26.7% and 5.4% label mistakes, respectively). It validated the consistency in the corrected version of both datasets.

annotation, subset, test subset, (15 more...)

arXiv.org Artificial Intelligence

2101.08698

Country:

Asia > China > Heilongjiang Province > Harbin (0.05)
North America > United States > Indiana > St. Joseph County > Notre Dame (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)


Google's TensorFlow gets a new test for training data leaks

#artificialintelligence, Jul-8-2020, 08:25:01 GMT

Google late last month debuted experimental tests for its TensorFlow Privacy library designed to reduce the degree to which machine learning models leak identifiable personal information in training data sets, such as for biometric facial recognition. The test module enables developers to "assess the privacy properties of their classification models," according to Google. The testing tool is known as a membership inference attack. Obvious applications for the technique include facial recognition and health care. This amounts to a second try for TensorFlow Privacy, which was introduced last year to address the "emerging topic" of privacy in machine learning, Google said.
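To make the idea of a membership inference attack concrete, here is a minimal Python sketch of one simple variant, a loss-based attack. It does not use or reproduce the TensorFlow Privacy API; the function name and the example loss values are hypothetical. The assumption is that a model tends to have lower loss on examples it was trained on, so an attacker can guess membership from the loss alone. The sketch scores how well that guess separates members from non-members.

```python
def loss_attack_auc(member_losses, nonmember_losses):
    """AUC of an attacker who ranks examples by loss (lower = 'member').

    Equals the probability that a randomly chosen training example has a
    lower loss than a randomly chosen held-out example, counting ties as
    one half. A value of 0.5 means the losses leak no membership signal;
    a value of 1.0 means membership can be read off perfectly.
    """
    if not member_losses or not nonmember_losses:
        raise ValueError("both loss lists must be non-empty")
    wins = 0.0
    for member_loss in member_losses:
        for nonmember_loss in nonmember_losses:
            if member_loss < nonmember_loss:
                wins += 1.0
            elif member_loss == nonmember_loss:
                wins += 0.5
    return wins / (len(member_losses) * len(nonmember_losses))


if __name__ == "__main__":
    # Hypothetical per-example losses, for illustration only.
    train_losses = [0.05, 0.10, 0.20, 0.40]
    heldout_losses = [0.30, 0.60, 0.90, 1.20]
    print(loss_attack_auc(train_losses, heldout_losses))
```

An AUC well above 0.5 on a real model would suggest that training-set membership is exposed through the model's outputs, which is the kind of privacy property such tests are meant to assess.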

artificial intelligence, google, machine learning, (7 more...)

#artificialintelligence

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.55)


Do ImageNet Classifiers Generalize to ImageNet?

Recht, Benjamin, Roelofs, Rebecca, Schmidt, Ludwig, Shankar, Vaishaal

arXiv.org Machine Learning, Feb-13-2019

We build new test sets for the CIFAR-10 and ImageNet datasets. Both benchmarks have been the focus of intense research for almost a decade, raising the danger of overfitting to excessively re-used test sets. By closely following the original dataset creation processes, we test to what extent current classification models generalize to new data. We evaluate a broad range of models and find accuracy drops of 3% - 15% on CIFAR-10 and 11% - 14% on ImageNet. However, accuracy gains on the original test sets translate to larger gains on the new test sets. Our results suggest that the accuracy drops are not caused by adaptivity, but by the models' inability to generalize to slightly "harder" images than those found in the original test sets.

accuracy, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

1902.10811

Country:

North America > United States > Tennessee (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Oceania > New Zealand > South Island > Marlborough District > Blenheim (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Marine (1.00)
Transportation > Ground > Road (1.00)
(5 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(3 more...)


Do CIFAR-10 Classifiers Generalize to CIFAR-10?

Recht, Benjamin, Roelofs, Rebecca, Schmidt, Ludwig, Shankar, Vaishaal

arXiv.org Machine Learning, Jun-1-2018

Machine learning is currently dominated by largely experimental work focused on improvements in a few key tasks. However, the impressive accuracy numbers of the best performing models are questionable because the same test sets have been used to select these models for multiple years now. To understand the danger of overfitting, we measure the accuracy of CIFAR-10 classifiers by creating a new test set of truly unseen images. Although we ensure that the new test set is as close to the original data distribution as possible, we find a large drop in accuracy (4% to 10%) for a broad range of deep learning models. Yet more recent models with higher original accuracy show a smaller drop and better overall performance, indicating that this drop is likely not due to overfitting based on adaptivity. Instead, we view our results as evidence that current accuracy numbers are brittle and susceptible to even minute natural variations in the data distribution.

artificial intelligence, machine learning, new test, (19 more...)

arXiv.org Machine Learning

1806.00451

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Marine (1.00)
Transportation > Ground > Road (1.00)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


AI assistants say dumb things, and we're about to find out why

#artificialintelligence, Mar-26-2018, 02:47:10 GMT

Siri and Alexa are clearly far from perfect, but there is hope that steady progress in machine learning will turn them into articulate helpers before long. A new test, however, may help show that a fundamentally different approach is required for AI systems to actually master language. Developed by researchers at the Allen Institute for AI (AI2), a nonprofit based in Seattle, the AI2 Reasoning Challenge (ARC) will pose elementary-school-level multiple-choice science questions. Each question will require some understanding of how the world works. The project is described in a related research paper (pdf).

artificial intelligence, machine learning, natural language, (6 more...)

#artificialintelligence

Industry: Education > Educational Setting (0.57)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.76)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.71)
Information Technology > Artificial Intelligence > Natural Language (0.53)
